Abstract

We reexamine fundamental problems from computational geometry in the word RAM model, 
where input coordinates are integers that fit in a machine word. We develop a new algorithm 
for offline point location, a two-dimensional analog of sorting where one needs to order points 
with respect to segments. This result implies, for example, that the convex hull of n points in 
three dimensions can be constructed in (randomized) time $n \cdot 2^{O(\sqrt{\lg\lg n})}$. Similar bounds hold
for numerous other geometric problems, such as planar Voronoi diagrams, planar off-line nearest 
neighbor search, line segment intersection, and triangulation of non-simple polygons. 

In FOCS'06, we developed a data structure for online point location, which implied a bound 
of $O(n \lg n / \lg\lg n)$ for three-dimensional convex hulls and the other problems. Our current bounds
are dramatically better, and a convincing improvement over the classic $O(n \lg n)$ algorithms.
As in the field of integer sorting, the main challenge is to find ways to manipulate information, 
while avoiding the online problem (in that case, predecessor search). 

1 Introduction 

1.1 Sorting in Two Dimensions 

Consider the following toy problem (in fact, a special case of offline point location), which we call 
the slab problem. We are given a vertical slab in the plane, m nonintersecting segments cutting 
across the slab, and n points in the slab. The goal is to identify the segment immediately below 
each of the n points. In other words, we would like to sort the points "relative to" the segments. 

This is an appealing and natural generalization to two dimensions of the one-dimensional notion 
of sorting. It captures both an intuitive notion of ordering, and the non-orthogonal flavor so common 
in computational geometry. Indeed, as described below, an impressive collection of fundamental 
problems in computational geometry are known to be reducible to this simple problem, so there is 
a formal sense in which the slab problem is as central in computational geometry as sorting is in 
the one-dimensional world. 

*A preliminary version of this work with the title "Voronoi Diagrams in $n \cdot 2^{O(\sqrt{\lg\lg n})}$ Time" appeared in Proc.
39th ACM Symposium on Theory of Computing, pages 31-39, 2007. 
Classically, the slab problem is solved by binary searching among segments for each input point,
for a cost of $O(\lg m)$ per point. This is optimal when one searches by binary decisions or assumes
the input has infinite precision, as on a real RAM. However, a more reasonable assumption is that
input has finite precision. We will assume, in particular, that all coordinates come from some
universe $[2^w] = \{0, \ldots, 2^w - 1\}$, and that we are working on a word RAM with w-bit words (i.e. one
coordinate fits in one word). See Section 1.3 for a discussion of these assumptions.
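As a point of reference, the classical per-point binary search can be sketched as follows. This is only a baseline illustration, not the algorithm of this paper. The representation of a segment as a pair of integer endpoints, the function names, and the convention that a point lying exactly on a segment counts that segment as "below" it are all assumptions made for the example.

```python
def on_or_above(q, seg):
    """True if point q = (x, y) lies on or above the line through segment seg.

    seg = ((xl, yl), (xr, yr)) with xl < xr; only integer arithmetic is used.
    """
    (xl, yl), (xr, yr) = seg
    x, y = q
    return (xr - xl) * (y - yl) - (yr - yl) * (x - xl) >= 0


def slab_binary_search(points, segments):
    """For each point, return the index of the segment immediately below it.

    `segments` must be non-crossing, span the slab, and be sorted bottom to top.
    A point below all segments gets None. Cost: O(lg m) tests per point.
    """
    answers = []
    for q in points:
        lo, hi = 0, len(segments)
        # Invariant: segments[:lo] are on or below q, segments[hi:] are above q.
        while lo < hi:
            mid = (lo + hi) // 2
            if on_or_above(q, segments[mid]):
                lo = mid + 1
            else:
                hi = mid
        answers.append(lo - 1 if lo > 0 else None)
    return answers
```

For example, with the three segments `((0, 1), (8, 3))`, `((0, 4), (8, 4))`, `((0, 6), (8, 9))` cutting across the slab $0 \le x \le 8$, the point `(4, 5)` is assigned index 1.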

Until recently, successful use of the word RAM in computational geometry was limited to a
restricted class of problems, especially problems involving orthogonal objects. However, in FOCS'06,
we proposed improved data structures for the online slab problem, a problem of a fundamentally
nonorthogonal nature [8]. The running time was asymptotically
$\min\{\lg m / \lg\lg m, \sqrt{w / \lg w}\}$ per point.

This represents a marginal improvement over O(lgm) for any universe, and a roughly quadratic 
improvement for small (polynomial) universes. 

In the current paper, we describe an algorithm for the (offline) slab problem running in time
$n \cdot 2^{O(\sqrt{\lg\lg m})} + O(m)$. Note that this bound does not depend on the universe (aside from assuming
a coordinate fits in a word), and is deterministic. The bound is a dramatic improvement over our
old bounds: note, for example, that the new bound grows more slowly than $n \lg^{\varepsilon} m + m$ for any
constant $\varepsilon > 0$. In addition, the new bound represents a much more convincing improvement over
the standard $O(n \lg m)$ bound, demonstrating the power granted by bounded precision.

The relation between our current algorithm and our results from [8] is best understood by a
parallel to integer sorting. There, the online problem (predecessor search) is known to require
comparatively large running times (e.g. in terms of n alone, an $\Omega(\sqrt{\lg n / \lg\lg n})$ lower bound per point
is known [4]). Yet, one can find ways of manipulating information in the offline problem, such that
the bottleneck of using the online problem is avoided (e.g. we can sort in $O(n\sqrt{\lg\lg n})$ expected
time [14]). It should be understood that the purpose of this work is not to study "bit tricks" in
the word RAM model, but to study how information about points and lines can be decomposed in 
algorithmically useful ways. 

1.2 Applications 

From [8] it follows that improved bounds for the slab problem lead to improved upper bounds for 
many fundamental problems in computational geometry [10 \ \U \ ITS ! 117 1 H8J. We list some here. As 
before, the bounds do not depend on the universe for the coordinates. All the reductions below, 
except the last, are randomized. 

1. We can compute the convex hull of n points in three dimensions in expected time $n \cdot 2^{O(\sqrt{\lg\lg n})}$.
If the hull has H vertices, the bound can be reduced to $n \cdot 2^{O(\sqrt{\lg\lg H})}$.

2. We can compute the Voronoi diagram and the Delaunay triangulation of n points in the 
plane in expected time $n \cdot 2^{O(\sqrt{\lg\lg n})}$. As a consequence, we can also compute the Euclidean
minimum spanning tree or solve the largest empty circle problem within the same time bound. 

3. Given n red points and n blue points in the plane, we can compute the red point nearest to 
each blue point in expected time $n \cdot 2^{O(\sqrt{\lg\lg n})}$.

4. We can compute all K intersections of n line segments in the plane in expected time
$n \cdot 2^{O(\sqrt{\lg\lg n})} + O(K)$. We can also construct the trapezoidal decomposition of the line segments
within the same time bound. 
5. We can triangulate a polygon with holes (or an environment with multiple disjoint polygons)
with n vertices total in time $n \cdot 2^{O(\sqrt{\lg\lg n})}$.

Problems like convex hulls and Voronoi diagrams date back to the dawn of computational 
geometry, and for these problems standard O(nlgn) bounds have long been regarded as "optimal." 
The previous paper [8] has demonstrated that on the word RAM, O(nlgn) can be beaten slightly, 
by $O(n \lg n / \lg\lg n)$ bounds. The current paper shows that $O(n \lg n)$ can be improved significantly.

Planar point location. The slab problem is actually a special case of the offline planar point 
location problem. Here, the input consists of a connected polygonal subdivision defined by a set of 
m segments in the plane, where segments are only allowed to touch at endpoints. Given n query 
points (offline), the goal is to identify the polygon (face) which contains each point. 

In fact, reductions discussed above are to the offline planar point location problem. However, 
the general case of point location turns out to be reducible to the special case of the slab problem. 
In [8], we considered three different ways to achieve this reduction. These three approaches hold 
both in the offline and online cases. They all generate a multiplicative penalty of at most $O(\lg\lg m)$
per point, which is absorbed by our time bound, but differ in the cost per segment:

(i) random sampling gives a randomized $(n + m) \cdot 2^{O(\sqrt{\lg\lg m})}$ running time.

(ii) persistence and exponential trees give a deterministic $n \cdot 2^{O(\sqrt{\lg\lg m})} + O(\mathrm{sort}(m))$ bound,
where sort(m) denotes the cost of sorting m integers.

(iii) planar separators are the most complicated (and least practical) strategy but give the best
deterministic running time of $n \cdot 2^{O(\sqrt{\lg\lg m})} + O(m)$.

Higher dimensions. We can also solve the analog of the slab problem in any constant dimension. 
Instead of a slab, we are given a vertical prism, and instead of segments, we are given hyperplanes 
cutting across the prism where no two hyperplanes intersect inside the prism. The running time of 
our solution is $n \cdot 2^{O(\sqrt{\lg\lg m})} \lg^{1+o(1)} w + O(m)$. Although the bound now has an extra $\lg^{1+o(1)} w$
factor, this factor is relatively small. This result has a few applications as well, for example, to 
offline exact nearest neighbor search in higher dimensions and curve-segment intersection in two 
dimensions (see [8]). 

Recent work. Subsequent to our conference publication, Buchin and Mulzer [6] announced a 
better result for planar Voronoi diagrams. They show that constructing Voronoi diagrams is
equivalent to sorting, in the Word RAM augmented with one non-standard operation. In the standard
Word RAM, they obtain an unconditional running time of $O(n\sqrt{\lg\lg n})$ by adapting the best known
sorting algorithm of [14].

This improves the running time of all problems mentioned in item 2. above. However, this 
algorithm exploits special properties about Voronoi diagrams and nearest neighbor graphs, and 
does not imply improved results for the other problems considered here, such as 3-d convex hulls 
and offline point location. 
1.3 Computational Geometry on a Grid

It is superfluous to state that bounded precision is a fact of life. Input data are given with finite 
precision and computers represent it internally in a bounded number of bits. Low-dimensional 
computational geometry has typically seen the negative consequences of this state of affairs. Algo- 
rithms are designed in idealized models of real arithmetic, and practitioners struggle to keep the 
algorithms working with imperfect precision. 

Yet, there is also a good side to bounded precision, and our result shows it can be used to achieve 
significantly better algorithms. Again, we emphasize that our improved bound is independent of 
whatever the bound on the precision happens to be. We just require that coordinates can be 
manipulated in constant time, which has always been a standard assumption. 

The philosophical question that we wish to address briefly is whether these benefits of bounded 
precision should be explored in theory. We believe firmly they should, examining the question both 
with an eye to practice and to theory. 

Practice. From a theoretical perspective, the classic solutions to online point location using linear 
space and logarithmic query time would seem attractive. However, as pointed out in a survey on the 
topic [20], the most efficient and popular practical implementations do not use them. Instead, they 
use grid-pruning heuristics, not unlike some of our ideas. Thus, we can hope to gain theoretical 
understanding for what is already known to be effective in the real world — a standard goal for 
theory. 

Turning this around, we can hope that theoretical improvements will suggest new ideas with an 
impact in practice. As presented, our results are theoretical because of large hidden constants, in 
both the slab problem and subsequent reductions. However, since the improvement over O(nlgn) 
is now quite significant, we find it plausible that some of the techniques developed here can provide 
inspiration for useful practical "tricks." A key step would be to circumvent the reductions and 
apply our techniques directly to target problems. 

Theory. Even at theory's end of computational geometry, the assumption of bounded precision 
has been used fairly often. Unfortunately, this body of work (see the bibliographies of [8]) has 
been plagued by close ties to one-dimensional problems. For example, two-dimensional convex 
hulls can be found in linear time once points are sorted by x-coordinate. More typically, the 
problems considered involved orthogonal objects, and such orthogonal problems can more easily be 
decomposed into one-dimensional problems. 

Our recent data structures for point location [8] broke this barrier, by presenting an improvement 
for a fundamentally nonorthogonal problem. The current paper tries to demonstrate that there 
are deeper questions to be explored in this direction of research. In our algorithm, we are forced 
to consider questions of decomposability and compressibility of information about two-dimensional 
objects, which seem fundamentally different from questions in one dimension. We feel this should 
have a theoretical appeal in itself. 

For example, consider two standard tools in sorting. One is radix sort, which gives linear-time 
sorting in polynomial universes. Another is hashing, which is used in virtually all advanced RAM 
sorting algorithms, including [2, 3, 12, 13, 14, 15, 21]. In two dimensions, neither of these tools
seems relevant. It is interesting that even without such basic tools, we can still obtain a rather
efficient, deterministic algorithm (even outperforming some of the older RAM sorting results).
1.4 Overview



The remainder of this paper is organized as follows. In Section 2 we describe a simple algorithm for
the slab problem, running in time $O(n\sqrt{\lg m} + m)$. This demonstrates the basic divide-and-conquer
strategy behind our solution. In Section 3 we implement this strategy much more carefully to obtain
an interesting recurrence that ultimately leads to the stated time bound of $n \cdot 2^{O(\sqrt{\lg\lg m})} + O(m)$.
The challenges faced by this improvement are similar to issues in integer sorting, and indeed we 
borrow (and build upon) some tools from that field. 

Unfortunately, the implementation of Section 3 requires a nonstandard word operation. In
Section 4 we describe how to implement the algorithm on a standard word RAM, using only
addition, multiplication, bitwise-logical operations, and shifts. Interestingly, the new implementation
requires some new geometric observations that affect the design of the recursion itself.

2 An Initial Algorithm for the Slab Problem 

2.1 The Basic Recursive Strategy 

We begin with a recursive strategy based on a simple observation, taken from [8]. (Later in Section
4 we will replace this with a more complicated recursive structure.) In the following, the notation
$\prec$ refers to the belowness relation.

Observation 1. Fix b and h. Let S be a set of m sorted disjoint segments, where all left endpoints
lie on an interval $I_L$ of length $2^{\ell_L}$ on a vertical line, and all right endpoints lie on an interval $I_R$
of length $2^{\ell_R}$ on another vertical line. In $O(b)$ time, we can find $O(b)$ segments $s_0, s_1, \ldots \in S$ in
sorted order, which include the lowest segment of S, such that:

(1) for each i, at least one of the following holds:

(1a) there are at most m/b segments of S between $s_i$ and $s_{i+1}$.

(1b) the left endpoints of $s_i$ and $s_{i+1}$ lie on a subinterval of length $2^{\ell_L - h}$.

(1c) the right endpoints of $s_i$ and $s_{i+1}$ lie on a subinterval of length $2^{\ell_R - h}$.

(2) there exist segments $\tilde{s}_0, \tilde{s}_2, \ldots$ cutting across the slab, satisfying all of the following:

(2a) $s_0 \prec \tilde{s}_0 \prec s_2 \prec \tilde{s}_2 \prec \cdots$.

(2b) distances between the left endpoints of the $\tilde{s}_i$'s are all multiples of $2^{\ell_L - h}$.

(2c) distances between right endpoints are all multiples of $2^{\ell_R - h}$.

Proof: Let B contain every $\lfloor m/b \rfloor$-th segment of S, starting with the lowest segment $s_0$. Impose
a grid over $I_L$ consisting of $2^h$ subintervals of length $2^{\ell_L - h}$, and a grid over $I_R$ consisting of $2^h$
subintervals of length $2^{\ell_R - h}$. We define $s_{i+1}$ inductively based on $s_i$, until the highest segment is
reached. We let $s_{i+1}$ be the highest segment of B such that either the left or the right endpoints
of $s_i$ and $s_{i+1}$ are in the same grid subinterval. This will satisfy (1b) or (1c). If no such segment
above $s_i$ exists, we simply let $s_{i+1}$ be the successor of $s_i$, satisfying (1a). (See Figure 1.)

Let $\tilde{s}_i$ be obtained from $s_i$ by rounding each endpoint to the grid point immediately above
(ensuring (2b) and (2c)). By construction of the $s_i$'s, both the left and right endpoints of $s_i$ and
$s_{i+2}$ are in different grid subintervals. Thus, $\tilde{s}_i \prec s_{i+2}$, ensuring (2a). □
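The construction in the proof can be sketched in a few lines. The sketch below is illustrative only: it assumes that both intervals start at height 0, represents each segment by its pair of integer endpoint heights, and uses a simple quadratic scan over B instead of the $O(b)$-time selection claimed in the observation. The function name and argument order are hypothetical.

```python
def select_segments(segs, ell_L, ell_R, b, h):
    """Pick the segments s_0, s_1, ... of Observation 1 and their rounded copies.

    segs: list of (left_height, right_height) pairs, sorted bottom to top, with
    0 <= left_height < 2**ell_L and 0 <= right_height < 2**ell_R (assumption:
    both endpoint intervals start at 0).
    Returns (chosen, rounded): indices into segs, and the rounded segments.
    """
    m = len(segs)
    step = max(1, m // b)
    B = list(range(0, m, step))        # every floor(m/b)-th segment, lowest first
    cell_L = 1 << (ell_L - h)          # length of a grid subinterval on the left
    cell_R = 1 << (ell_R - h)          # length of a grid subinterval on the right

    def share_cell(i, j):
        return (segs[i][0] // cell_L == segs[j][0] // cell_L
                or segs[i][1] // cell_R == segs[j][1] // cell_R)

    chosen = [B[0]]
    pos = 0                            # position in B of the last chosen segment
    while pos < len(B) - 1:
        nxt = pos + 1                  # default: successor in B, giving (1a)
        for k in range(len(B) - 1, pos, -1):
            if share_cell(B[pos], B[k]):
                nxt = k                # highest segment sharing a cell: (1b) or (1c)
                break
        chosen.append(B[nxt])
        pos = nxt

    # Round each endpoint up to the grid point immediately above it.
    rounded = [((segs[i][0] // cell_L + 1) * cell_L,
                (segs[i][1] // cell_R + 1) * cell_R) for i in chosen]
    return chosen, rounded
```

On a small input one can check property (2a) directly: each rounded segment lies above its original and does not rise above the segment chosen two steps later.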
Figure 1: Proof of Observation 1: an example (with $\ell_L = \ell_R$). The left diagram shows the segments
in B.



Slab(Q, S):

0. if m = 0, set all answers to null and return

1. let $s_0, s_1, \ldots$ be the $O(b)$ segments from Observation 1

2. let $\varphi$ be the projective transform mapping $I_L$ to $\{0\} \times [0, 2^h]$ and $I_R$ to $\{2^h\} \times [0, 2^h]$.
   Compute round($\varphi(Q)$) and $\varphi(\tilde{s}_0), \varphi(\tilde{s}_2), \ldots$

3. Slab_0(round($\varphi(Q)$), $\{\varphi(\tilde{s}_0), \varphi(\tilde{s}_2), \ldots\}$)

4. for each $q \in Q$ with ans[round($\varphi(q)$)] = $\varphi(\tilde{s}_i)$ do
   set ans[q] = the segment from $\{s_{i-4}, \ldots, s_{i+7}\}$ immediately below q

5. for each $s_i$ do
   Slab($\{q \in Q \mid \mathrm{ans}[q] = s_i\}$, $\{s \in S \mid s_i \prec s \prec s_{i+1}\}$)

Figure 2: A recursive algorithm for the slab problem. Parameters b and h are fixed in the analysis;
round(·) maps a point to its nearest integral point.
The above observation naturally suggests a recursive algorithm. In the pseudocode in Figure 2,
the input is a set Q of n points and a sorted set S of m disjoint segments, where the left and right
endpoints lie on intervals $I_L$ and $I_R$ of length $2^{\ell_L}$ and $2^{\ell_R}$ respectively. At the end, ans[q] stores
the segment from S immediately below q for each $q \in Q$. A special null value for ans[q] signifies
that q is below all segments. We assume a (less efficient) procedure Slab_0(Q, S), with the same
semantics as Slab(Q, S), which is used as a bottom case of the recursion. The choice of Slab_0()
is a crucial component of the analysis.

We first explain why the pseudocode works. In step 2, an explicit formula for the transform $\varphi$
has already been given in [8]; this mapping preserves the belowness relation. According to property
(2) in Observation 1, we know that the transformed segments $\varphi(\tilde{s}_0), \varphi(\tilde{s}_2), \ldots$ all have h-bit integer
coordinates from $[2^h]$. After rounding, the n points $\varphi(Q)$ will lie in the same universe.

Any unit square can intersect at most two of the $\varphi(\tilde{s}_i)$'s, since these segments have vertical
separation at least one and thus horizontal separation at least one (as slopes are between $-1$ and
1). If $\varphi(\tilde{s}_i) \prec \mathrm{round}(\varphi(q)) \prec \varphi(\tilde{s}_{i+2})$, then we must have $\varphi(\tilde{s}_{i-4}) \prec \varphi(q) \prec \varphi(\tilde{s}_{i+6})$, implying
that $s_{i-4} \prec \tilde{s}_{i-4} \prec q \prec \tilde{s}_{i+6} \prec s_{i+8}$. Thus, at step 4, ans[q] contains the segment from $s_0, s_1, \ldots$
immediately below q. Once this is determined for every point $q \in Q$, we can recursively solve the
subproblem for the subset of points and segments strictly between $s_i$ and $s_{i+1}$ for each i, as is done
at step 5. An answer ans[q] = null from the i-th subproblem is interpreted as ans[q] = $s_i$.

Let $\ell = (\ell_L + \ell_R)/2$ ($\ell \le w$). Denote by $T(n, m, \ell)$ the running time of Slab(), and $T_0(n, b', h)$
the running time of the call to Slab_0() in step 3. Steps 1, 2 and 4 can be implemented naively in
$O(n + m)$ time. We have the recurrence:

$$T(n, m, \ell) = T_0(n, b', h) + O(n + m) + \sum_{i=0}^{b'} T(n_i, m_i, \ell_i), \qquad (1)$$

where $b' = O(b)$, $\sum_i n_i = n$, $\sum_i m_i = m - b'$. Furthermore, according to property (1) in
Observation 1, for each i we either have $m_i \le m/b$ or $\ell_i \le \ell - h/2$. This implies that the depth of the recursion
is $O(\log_b m + \ell/h)$.

2.2 An $O(n\sqrt{\lg m} + m)$ Algorithm

In the previous paper [8], we have noticed that for $b'h \approx w$, Slab_0() can be implemented in
$T_0(n, b', h) = O(n)$ time by packing b' segments from an h-bit universe into a word. By setting
$b \approx \log^{\varepsilon} m$ and $h \approx w / \log^{\varepsilon} m$, this leads to an $O((n + m) \lg m / \lg\lg m)$ algorithm.

Instead of packing multiple segments in a word, our new idea is to pack multiple points in a 
word. To understand why this helps, remember that the canonical implementation for Slab_0()
runs in time $O(n \lg m)$ by choosing the middle segment and recursing on points above and below
this segment. By packing t segments in a word, we can hope to reduce this time to $O(n \log_t m)$.
However, by packing t points in a word, we can potentially reduce this to $O(\frac{n}{t} \lg m)$, a much bigger
gain. (One can also think of packing both points and segments, for a running time of $O(\frac{n}{t} \log_t m)$.
Since we will ultimately obtain a much faster algorithm, we ignore this slight improvement.) 

To implement this idea, step 2 will pack round($\varphi(Q)$) with $O(w/h)$ points per word. Each
point is allotted $O(h)$ bits for the coordinates, plus $\lg b = O(h)$ bits for the answer ans[round($\varphi(q)$)]
which Slab_0() must output. This packing can be done in $O(n)$ time, adding one point at a time.
Working on packed points, Slab_0() has the potential of running faster, as evidenced by the
following lemma. For now, we do not concern ourselves with the implementation on a word RAM,
and assume nonstandard operations (an operation takes two words as input, and outputs one word).

Lemma 2. If $\lg b \le h \le w$, Slab_0() can be implemented on a RAM with nonstandard operations
with a running time of $T_0(n, b, h) = O(n \frac{h}{w} \lg b + b)$.

Proof: Given a segment and a number of points packed in a word, we can postulate two operations
which output the points above (respectively below) the segment, packed consecutively in a word.
Choosing a segment, we can partition the points into points above and below the segment in
$O(\lceil n h / w \rceil)$ time. In the same asymptotic time, we can make both output sets be packed with
$\Theta(w/h)$ points per word (merging consecutive words which are less than full).

We now implement the canonical algorithm: partition points according to the middle segment
and recurse. As long as we are working with at least $w/h$ points, the cost is $O(h/w)$ per point for each
segment, and each point is considered $O(\lg b)$ times. If we are dealing with fewer than $w/h$ points, the
cost is $O(1)$, and that can be charged to the segment considered. Thus, the total time after packing
is $O(n \frac{h}{w} \lg b + b)$.

The last important issue is the representation of the output. By the above, we obtain the sets 
of points which lie between two consecutive segments. We can then trivially fill in the answer for 
every point in the $\lg b$ bits allotted for that. However, we want an array of answers for the points
in the original order. To do that, we trace the algorithm from above backwards in time. We use 
an operation which is the inverse of splitting a word into points above and below a segment. □ 

Plugging the lemma into (1), we get $T(n, m, \ell) = O(n \frac{h}{w} \lg b + n + m) \cdot O(\log_b m + \frac{\ell}{h})$. Setting
$\lg b = \sqrt{\lg m}$ and $h = w/\sqrt{\lg m}$, we obtain $T(n, m, w) = O((n + m)\sqrt{\lg m})$. This can be improved to
$O(m + n\sqrt{\lg m})$ by the standard trick of considering only one in $O(\sqrt{\lg m})$ consecutive segments. For
every point, we finish off by binary searching among $O(\sqrt{\lg m})$ segments, for a negligible additional
time of $O(n \lg\lg m)$.
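For concreteness, the substitution behind this bound can be written out as follows. This is only a restatement of the parameter choice above, using $\ell \le w$; it introduces no new assumptions.

```latex
\begin{align*}
\lg b = \sqrt{\lg m},\quad h = \frac{w}{\sqrt{\lg m}}
  \;&\Longrightarrow\; \frac{h}{w}\,\lg b = 1, \\
\log_b m = \frac{\lg m}{\lg b} &= \sqrt{\lg m}, \qquad
\frac{\ell}{h} \le \frac{w}{h} = \sqrt{\lg m}, \\
T(n, m, w) = O(n \cdot 1 + n + m)\cdot O\bigl(\sqrt{\lg m} + \sqrt{\lg m}\bigr)
  &= O\bigl((n + m)\sqrt{\lg m}\bigr).
\end{align*}
```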

3 An $n \cdot 2^{O(\sqrt{\lg\lg m})} + O(m)$ Algorithm

3.1 Preliminaries

To improve on the $O(m + n\sqrt{\lg m})$ bound, we bootstrap: we use an improved algorithm for Slab()
as Slab_0(), obtaining an even better bound for Slab(). To enable such improvements, we can
no longer afford the $O(n)$ term in the recurrence (1). Rather, a call to Slab() is passed Q in
word-packed form, and we want to implement the steps between recursive calls in sublinear time
(close to the number of words needed to represent Q, not to $n = |Q|$).

This task will require further ideas and more sophisticated word-packing tricks. To understand 
the complication, let us contrast implementing steps 2 and 5 of Slab() in sublinear time. Computing
round($\varphi(Q)$) in step 2 is solved by applying a function in parallel to a word-packed vector of points.
This is certainly possible, at least using nonstandard word operations. However, step 5 needs to 
group elements of Q into subsets (i.e. sort Q according to ans[q]). This is a deeper information- 
theoretic limitation, and it is rather unlikely that it can always be done in time linear in the number 
of words needed to store Q. The problem has connections to applying permutations in external 
memory, a well-studied problem which is believed to obey similar limitations pQ. 
To implement step 5 (and also step 4), we will use a subroutine Split(Q, label). This receives
a set Q of $\ell$-bit elements, packed in $O(n\ell/w)$ words. Each element $q \in Q$ has a $(\lg m)$-bit label
label[q] with $\lg m \le \ell$. The labels are stored in the same $O(n\ell/w)$ words. We can think of each
word as consisting of two portions, the first containing $O(w/\ell)$ elements and the second containing
the corresponding $O(w/\ell)$ labels. The output of Split(Q, label) is a collection of sublists, so that
all elements of Q with the same label are put in the same sublist (in arbitrary order).

In addition, we will need Split() to be reversible. Suppose the labels in the sublists have been
modified. We need a subroutine Unsplit(Q), which outputs Q in the original order before Split(), 
but with the modified labels attached. 

The following lemma states the time bound we will use for these two operations. The
implementation of Split() is taken from a paper by Han [12] and has been also used as a subroutine in
several integer sorting algorithms [13, 14]. As far as we know, the observation that Unsplit() is
possible in the same time bound has not been stated explicitly before.

Lemma 3. Assume label[q] $\in [m]$ for all $q \in Q$, and let M be a parameter. If $\frac{w}{\ell} \lg m \le \frac{1}{2} \lg M$
and $\lg m \le \ell \le w$, both Split() and Unsplit() require time $O(n \frac{\ell}{w} \lg \frac{w}{\ell} + M)$.

Proof: Let $g = w/\ell$. Each word contains g elements, with $g \lg m$ bits of labels. Put words with
the same label pattern in the same bucket. This can be done in $O(n/g + \sqrt{M})$ time, since the
number of different label patterns is at most $2^{g \lg m} \le \sqrt{M}$. For each bucket, we form groups of g
words and transpose each group to get g new words, where the i-th element of the j-th new word is
the j-th element of the i-th old word. Transposition can be implemented in $O(\lg g)$ standard word
operations [21]. Elements in each new word now have identical labels. We can put these words in
the correct sublists, in $O(n/g + m)$ time. There are at most g leftover elements per bucket, for a
total of $O(\sqrt{M} g) = o(M)$; we can put them in the correct sublists in $o(M)$ time. The total time is
therefore $O((n/g) \lg g + M)$.

To support unsplitting, we remember information about the splitting process. Namely, whenever 
we transpose g words, we create a record pointing to the g old words and the g new words. To 
unsplit, we examine each record created and transpose its g new words again to get back the g 
old words (with labels now modified). We can also update the leftover elements by creating o(M) 
additional pointers. □ 
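The bucketing and transposition argument can be illustrated with a small simulation. The sketch below is not a word-RAM implementation: a "word" is modelled as a Python list of g (element, label) pairs, transposition is done element by element rather than in $O(\lg g)$ word operations, and leftover words are broken into single-element words. The function names and the record format are assumptions made for the example.

```python
from collections import defaultdict


def split(words, g):
    """Group elements by label via bucketing on label patterns and transposing.

    words: list of simulated words, each a list of exactly g (element, label) pairs.
    Returns (sublists, records, leftovers); sublists maps a label to a list of
    words whose cells are mutable [element, label] lists.
    """
    buckets = defaultdict(list)
    for idx, word in enumerate(words):
        buckets[tuple(lab for _, lab in word)].append(idx)

    sublists = defaultdict(list)
    records = []      # (old word indices, new words) for each transposed group
    leftovers = []    # (old word index, position, cell) for ungrouped elements
    for pattern, idxs in buckets.items():
        full = len(idxs) - len(idxs) % g
        for start in range(0, full, g):
            group = idxs[start:start + g]
            # i-th element of the j-th new word = j-th element of the i-th old word
            new_words = [[list(words[i][j]) for i in group] for j in range(g)]
            for j, new_word in enumerate(new_words):
                sublists[pattern[j]].append(new_word)   # all labels equal pattern[j]
            records.append((group, new_words))
        for i in idxs[full:]:
            for j, (elem, lab) in enumerate(words[i]):
                cell = [elem, lab]
                sublists[lab].append([cell])
                leftovers.append((i, j, cell))
    return sublists, records, leftovers


def unsplit(num_words, g, records, leftovers):
    """Rebuild the words in their original order, with the (possibly modified) labels."""
    words = [[None] * g for _ in range(num_words)]
    for group, new_words in records:
        for j, new_word in enumerate(new_words):
            for k, i in enumerate(group):
                words[i][j] = tuple(new_word[k])         # transpose back
    for i, j, cell in leftovers:
        words[i][j] = tuple(cell)
    return words
```

After `split`, labels can be rewritten in place inside the sublists; `unsplit` then returns every element to its original position with the new label attached, which is the reversibility property used in the text.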

A particularly easy application of this machinery is to implement the algorithm of Section 2
with standard operations (with a minor lglgm slowdown). This result is not interesting by itself, 
but it will be used later as the base case of our bootstrapping strategy. 

Corollary 4. If $\frac{w}{h} \lg b \le \frac{1}{2} \lg M$ and $\lg b \le h \le w$, the algorithm for Slab_0() from Lemma 2 can be
implemented on a word RAM with standard operations in time $T_0(n, b, h) = O(n \frac{h}{w} \lg b \lg \frac{w}{h} + bM)$.

Proof: The nonstandard operations used before were splitting and unsplitting a set of points 
packed in a word, depending on sidedness with respect to a segment. It is not hard to compute 
sidedness of all points from a word in parallel using standard operations: we apply the linear map 
defining the support of the segment to all points (which is a parallel multiplication and addition),
and keep the sign bits of each result. The sign bits define 1-bit labels for the points, and we can
apply Split() and Unsplit() for these. □
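The parallel sidedness test can be illustrated as follows. This is a sketch under stated assumptions rather than the exact packing used in the paper: a Python integer stands in for a machine word, each point gets a field of $2h + 2$ bits (a width chosen here so that products and a sign bit fit), the segment runs from $(0, y_0)$ to $(2^h, y_1)$, and a bias is added so that every field stays nonnegative. The function names are hypothetical.

```python
def pack(values, field_bits):
    """Pack nonnegative integers into one integer, values[i] at bit offset i * field_bits."""
    word = 0
    for i, v in enumerate(values):
        word |= v << (i * field_bits)
    return word


def sides_packed(xs, ys, y0, y1, h):
    """For each point (xs[i], ys[i]), decide whether it lies on or above the
    segment from (0, y0) to (2**h, y1), using one multiply-and-add over all points.

    All coordinates are assumed to lie in [0, 2**h).
    """
    W = 1 << h
    F = 2 * h + 2                    # assumed field width: products plus a sign bit
    k = len(xs)
    ones = pack([1] * k, F)          # a 1 in the lowest bit of every field
    X, Y = pack(xs, F), pack(ys, F)
    dy = y1 - y0
    bias = 1 << (F - 1)
    # Field i of P - Q equals bias + W*y_i - dy*x_i - W*y0, which lies in (0, 2**F),
    # so no borrow crosses a field boundary.
    P = Y * W + X * max(-dy, 0) + ones * bias
    Q = X * max(dy, 0) + ones * (W * y0)
    signs = ((P - Q) >> (F - 1)) & ones      # top bit of each field: 1 iff value >= 0
    return [bool((signs >> (i * F)) & 1) for i in range(k)]
```

For instance, with $h = 4$, $y_0 = 2$ and $y_1 = 10$, the points $(4, 3)$ and $(4, 4)$ fall below and on the segment respectively. The resulting bits play the role of the 1-bit labels passed to Split().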

Since the algorithm is used with $\lg b = \sqrt{\lg m}$ and $h = w/\sqrt{\lg m}$, we incur a slowdown of $O(\lg \frac{w}{h}) =
O(\lg\lg m)$ per point compared to the implementation with nonstandard operations. By setting
$M = m^2$, the algorithm of the previous section would then run in time $O(n\sqrt{\lg m} \lg\lg m + m^3)$ if
implemented with standard operations. (The dependence of the second term on m can be lowered
as well.)

3.2 The Improved Algorithm 

Our fastest algorithm follows the same pseudocode of Figure 2, but with a more careful
implementation of the individual steps. Let $\bar\ell$ be the number of bits per point and $\bar m$ the original number of
segments in the root call to Slab(). We have $\lg \bar m \le \bar\ell \le w$. In a recursive call to Slab(), the input
consists of some n points and $m \le \bar m$ segments, all with coordinates from $[2^{\ell}]$, where $\ell \le \bar\ell$. Points
will be packed in $O(\bar\ell)$ bits each, so the entire set Q occupies $O(n\bar\ell/w)$ words. At the end, the output
ans[q] is encoded as a label with $\lg \bar m$ bits, stored within each point $q \in Q$, with the order of the
points unchanged in the list Q. Note that one could think of repacking more points per word as $\ell$
and m decrease, but this will not yield an asymptotic advantage, so we avoid the complication (on
the other hand, repacking before the call to Slab_0() is essential).

In step 2, we can compute round($\varphi(Q)$) in time linear in the number of words $O(n\bar\ell/w)$, by using
a nonstandard word operation that applies the projective transform (and rounding) to multiple
points packed in a word. Unfortunately, it does not appear possible to implement this efficiently
using standard operations. We will deal with this issue in Section 4 by changing the algorithm for
Slab() so that we only require affine transformations, not projective transformations.

Before the call to Slab_0() in step 3, we need to condense the packing of the points round($\varphi(Q)$)
to take up $O(n h / w)$ words. Previously, we had $O(w/\bar\ell)$ points per word, but after step 2, only $O(h)$
bits of each point were nonzero. We will stipulate that points always occupy a number of bits
which is a power of 2. This does not affect the asymptotic running time. Given this property, we
obtain a word of round($\varphi(Q)$) by condensing $\bar\ell/h$ words. This operation requires shifting each old
word, and ORing it into the new word.

Note that the order of round($\varphi(Q)$) is different from the order of Q, but this is irrelevant,
because we can also reverse the condensing easily. We simply mask the bits corresponding to each old
word, and shift them back. Thus, we obtain the labels generated by Slab_0() in the original order
of Q. Both condensing and its inverse take $O(n\bar\ell/w)$ time.
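The condensing step and its inverse can be sketched with plain shifts, ORs and masks. In the sketch below a Python integer again stands in for a machine word; the field width `field_bits` (the number of bits a point occupied before step 2) and `h` are assumed to be powers of 2 with `h` dividing `field_bits`, and the function names are hypothetical.

```python
def pack_fields(values, field_bits):
    """Pack values into one integer, values[i] at bit offset i * field_bits."""
    word = 0
    for i, v in enumerate(values):
        word |= v << (i * field_bits)
    return word


def condense(old_words, field_bits, h):
    """Merge up to field_bits // h sparse words into one dense word.

    Each old word holds fields of width field_bits in which only the low h bits
    may be nonzero; old word j is shifted left by j * h and ORed in.
    """
    assert len(old_words) <= field_bits // h
    dense = 0
    for j, word in enumerate(old_words):
        dense |= word << (j * h)
    return dense


def expand(dense, count, field_bits, h, fields):
    """Inverse of condense: mask out the bits of each old word and shift them back."""
    mask = 0
    for i in range(fields):
        mask |= ((1 << h) - 1) << (i * field_bits)
    return [(dense >> (j * h)) & mask for j in range(count)]
```

Note that the points of one old word end up interleaved with those of the others inside the dense word, which is why the order of the condensed list differs from the order of Q, and why the inverse operation is needed to bring labels back.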

For the remainder of the steps, we need Split() and Unsplit(). For that, we fix a parameter
M satisfying $\frac{w}{\bar\ell} \lg \bar m \le \frac{1}{2} \lg M$. In step 4, we first split the list round($\varphi(Q)$) into sublists with the
same ans labels. For each sublist, we can perform the constant number of comparisons per point
required in step 4, and then record the new ans labels in the list, in time linear in the number of
words $O(n\bar\ell/w)$. It is standard to implement this in the word RAM by parallel multiplications (see
the proof of Lemma 3). To complete step 4, we Unsplit() to get back the entire list round($\varphi(Q)$),
and then copy the ans labels to the original list Q in $O(n\bar\ell/w)$ time. Since both lists are in the same
order, this can be done by masking labels and ORing them in.

To perform step 5, we again split Q into sublists with the same ANS labels. After the recursive 
calls, we unsplit to get Q back in the original order, with the new ANS labels. 

3.3 Analysis 

For $\frac{w}{\bar\ell} \lg \bar m \le \frac{1}{2} \lg M$ and $\lg \bar m \le \bar\ell \le w$, the recurrence (1) now becomes:

$$T(n, m, \ell) = T_0(n, b', h) + O\!\left(n \frac{\bar\ell}{w} \lg \frac{w}{\bar\ell} + M\right) + \sum_{i=0}^{b'} T(n_i, m_i, \ell_i), \qquad (2)$$

where $b' = O(b)$, $\sum_i n_i = n$, $\sum_i m_i = m - b'$, and for each i, we either have $m_i \le m/b$ or $\ell_i \le \ell - h/2$.

As before, the depth of the recursion is bounded by $O(\log_b m + \ell/h)$.

Assume that for $\frac{w}{h} \lg b \le \frac{1}{2} \lg M$ and $\lg b \le h \le w$, an algorithm with running time

$$T_0(n, b, h) \le c_k \left( n \frac{h}{w} \lg^{1/k} b \, \lg\!\left(\frac{w}{h} \lg b\right) + bM \right)$$

is available to begin with. This is true for k = 1 with $c_1 = O(1)$ by Corollary 4.
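Then the recurrence (2) yields:

$$T(n, m, \ell) = O(c_k) \left( n \frac{h}{w} \lg^{1/k} b \, \lg\!\left(\frac{w}{h} \lg b\right) + n \frac{\bar\ell}{w} \lg \frac{w}{\bar\ell} \right) \left( \log_b m + \frac{\ell}{h} \right) + mM.$$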



Set $\lg b = \lg^{k/(k+1)} \bar m$ and $h = \bar\ell / \lg^{1/(k+1)} \bar m$. Notice that indeed $\frac{w}{h} \lg b = \frac{w}{\bar\ell} \lg \bar m \le \frac{1}{2} \lg M$ and
$\lg b \le h \le w$. Thus, we obtain an algorithm with running time:



$$T(n,m,\ell) \le c_{k+1}\left(n\frac{\ell}{w}\,\lg^{1/(k+1)} m\,\lg\!\left(\frac{w}{\ell}\lg m\right) + mM\right)$$

for some $c_{k+1} = O(1)\cdot c_k$.
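As a check on this parameter choice, the substitution can be carried out term by term. The following is a sketch that assumes the induction hypothesis has the form $T(n,b,h) \le c_k\big(n\frac{h}{w}\lg^{1/k} b\,\lg(\frac{w}{h}\lg b) + bM\big)$ and does not track constant factors.

```latex
\begin{align*}
\lg b = \lg^{k/(k+1)} m,\quad h = \frac{\ell}{\lg^{1/(k+1)} m}
  \;&\Longrightarrow\;
  \log_b m = \frac{\lg m}{\lg b} = \lg^{1/(k+1)} m,
  \qquad \frac{\ell}{h} = \lg^{1/(k+1)} m, \\
\lg^{1/k} b &= \lg^{1/(k+1)} m,
  \qquad \frac{w}{h}\lg b = \frac{w}{\ell}\lg m, \\
n\frac{h}{w}\,\lg^{1/k} b\,\lg\!\Big(\frac{w}{h}\lg b\Big)
  \Big(\log_b m + \frac{\ell}{h}\Big)
  &= n\frac{\ell}{w}\cdot\frac{\lg^{1/(k+1)} m}{\lg^{1/(k+1)} m}
     \cdot\lg\!\Big(\frac{w}{\ell}\lg m\Big)\cdot 2\lg^{1/(k+1)} m \\
  &= 2\,n\frac{\ell}{w}\,\lg^{1/(k+1)} m\,\lg\!\Big(\frac{w}{\ell}\lg m\Big).
\end{align*}
```

The remaining per-level term $n\frac{\ell}{w}\lg\frac{w}{\ell}$, multiplied by the same depth, is dominated by this expression because $\lg\frac{w}{\ell} \le \lg\big(\frac{w}{\ell}\lg m\big)$.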

Iterating this process k times, we get: 



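$$T(n,m,\ell) \le 2^{O(k)}\left(n\frac{\ell}{w}\,\lg^{1/k} m\,\lg\!\left(\frac{w}{\ell}\lg m\right) + mM\right)$$

for any value of $k$. Choosing $k = \sqrt{\lg\lg m}$ to asymptotically minimize the expression, and plugging
in $\ell = w$ and $M = m^2$ (so that indeed $\frac{w}{\ell}\lg m \le \frac{1}{2}\lg M$), we get:

$$T(n,m,w) = 2^{O(\sqrt{\lg\lg m})}\,(n + m^3).$$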
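The choice of $k$ can be justified with a short calculation. This is a sketch of the balancing argument under the assumption that the bound before substitution is $2^{O(k)}\big(n\frac{\ell}{w}\lg^{1/k} m\,\lg(\frac{w}{\ell}\lg m) + mM\big)$; it is not taken verbatim from the paper.

```latex
\begin{align*}
2^{O(k)}\,\lg^{1/k} m &= 2^{\,O(k) + (\lg\lg m)/k},
  &&\text{balanced when } k = \Theta\big(\sqrt{\lg\lg m}\big), \\
k = \sqrt{\lg\lg m} \;&\Longrightarrow\;
  2^{O(k)}\,\lg^{1/k} m = 2^{O(\sqrt{\lg\lg m})}, \\
\ell = w \;&\Longrightarrow\;
  \lg\!\Big(\frac{w}{\ell}\lg m\Big) = \lg\lg m = 2^{\lg\lg\lg m}
  = 2^{o(\sqrt{\lg\lg m})}, \\
M = m^2 \;&\Longrightarrow\; mM = m^3 .
\end{align*}
```

Multiplying these factors together gives the stated bound $2^{O(\sqrt{\lg\lg m})}(n + m^3)$.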

We can reduce the dependence on $m$ to linear as follows. First, select one out of every $m^{3/4}$
consecutive segments of S, and run the above algorithm on just these $m^{1/4}$ segments. This takes
$2^{O(\sqrt{\lg\lg m})}\,(n + m^{3/4})$ time. Now recurse between each consecutive pair of selected segments.

The depth of the recursion is $O(\lg\lg m)$, and it is straightforward to verify that the running time
is $n \cdot 2^{O(\sqrt{\lg\lg m})} + O(m)$.
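The $O(\lg\lg m)$ depth can be seen by tracking subproblem sizes: each gap between selected segments holds about $m^{3/4}$ segments, so $\lg m$ shrinks by a factor of 3/4 per level. The sketch below counts levels in the log domain; the function name `recursion_depth` and the cutoff `lg_threshold` (the size at which a subproblem is treated as constant) are assumptions made for illustration.

```python
def recursion_depth(lg_m, lg_threshold=4.0):
    """Count recursion levels when a problem on m segments spawns
    subproblems on about m**(3/4) segments.

    lg_m is the base-2 logarithm of the number of segments; working with
    logarithms avoids huge numbers. Recursion stops once lg(size) drops
    to lg_threshold or below (an arbitrary cutoff chosen for this sketch).
    """
    depth = 0
    while lg_m > lg_threshold:
        lg_m *= 0.75  # m -> m**(3/4)  means  lg m -> (3/4) lg m
        depth += 1
    return depth
```

Since the loop runs roughly $\log_{4/3}(\lg m / \text{lg\_threshold})$ times, the depth grows like $\lg\lg m$; for instance, `recursion_depth(64)` (that is, $m = 2^{64}$) returns 10.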

4 Avoiding Nonstandard Operations 

The only nonstandard operation used by the algorithm of Section 3 is applying a projective
transform in parallel to points packed in a word. Unfortunately, it does not seem possible to implement
this in constant time using standard word RAM operations (since, according to the formula for
projective transform, this operation requires multiple divisions where the divisors are all different).

One idea is to simulate the special operation in slightly superconstant time. We can use the
circuit simulation results of Brodnik et al. [5] to reduce the operation to $\lg w \cdot (\lg\lg w)^{O(1)}$ standard
operations. For the version of the slab problem in dimension 3 or higher, this is the best approach
we know. Note that the results from the previous section hold in any constant dimension, by simply
using the multidimensional analog of Observation 1 from [8].

However, in two dimensions we can get rid of the dependence on the universe, obtaining a time
bound of $n \cdot 2^{O(\sqrt{\lg\lg m})} + O(m)$ on the standard word RAM. This is the subject of this
section.

4.1 The Center Slab 

By horizontal translation, we can assume the left boundary of our vertical slab is the y-axis. Let
the abscissa of the right boundary be $A$. For some $h$ to be determined, let the center slab be the
region of the plane defined by $A/2^h \le x \le A\cdot(1 - 2^{-h})$. The lateral slabs are defined in the
intuitive way: the left slab by $0 \le x \le A/2^h$ and the right slab by $A\cdot(1 - 2^{-h}) \le x \le A$.
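The three regions can be made concrete with a small sketch. It assumes exact rational arithmetic and, as a convention chosen only for this illustration, assigns points lying exactly on a shared boundary to the center slab; the function name `classify_slab` is hypothetical.

```python
from fractions import Fraction


def classify_slab(x, A, h):
    """Return 'left', 'center' or 'right' for abscissa x in the slab [0, A].

    The center slab is A/2^h <= x <= A*(1 - 2^-h); the lateral slabs are the
    two strips of width A/2^h at either side. Boundary points are assigned
    to the center slab (a convention of this sketch, not of the paper).
    """
    if not 0 <= x <= A:
        raise ValueError("x must lie inside the slab [0, A]")
    left_edge = Fraction(A, 2 ** h)   # A / 2^h
    right_edge = A - left_edge        # A * (1 - 2^-h)
    if x < left_edge:
        return "left"
    if x > right_edge:
        return "right"
    return "center"
```

With $A = 16$ and $h = 2$, the lateral slabs have width 4, so `classify_slab(1, 16, 2)` is `'left'`, `classify_slab(8, 16, 2)` is `'center'`, and `classify_slab(13, 16, 2)` is `'right'`.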

The key observation is that distances are somewhat well-behaved in the center slab, so we will 
be able to decrease both the left and right intervals at the same time, not just one of them. This 
enables us to use (easier to implement) affine maps instead of projective maps. Center slabs were 
also used in one of our previous papers [19], but as presented there, the idea cannot get rid of the 
dependence on the universe. This paper's definition and use of the center slab is rather different, 
and gets rid of this dependence. 

The following is a replacement for Observation 1.

Observation 5. Fix $b$ and $h$. Let S be a set of $m$ sorted disjoint segments, such that all left
endpoints lie on an interval $I_L$ and all right endpoints lie on an interval $I_R$, where both $I_L$ and
$I_R$ have length $2^\ell$. In $O(b)$ time, we can find $O(b)$ segments $s_0, s_1, \ldots \in S$ in sorted order, which
include the lowest segment of S, such that:

(1) for each $i$, at least one of the following holds:

(1a) there are at most $m/b$ segments of S between $s_i$ and $s_{i+1}$.

(1b) both the left and right endpoints of $s_i$ and $s_{i+1}$ are at distance at most $2^{\ell-h}$.

(2) there exist segments $\tilde{s}_0 \prec \tilde{s}_1 \prec \cdots$ cutting across the slab, satisfying all of the following:

(2a) distances between the left endpoints of the $\tilde{s}_i$'s are multiples of $2^{\ell-2h}$.

(2b) ditto for the right endpoints.

(2c) inside the center slab, $s_0 \prec \tilde{s}_0 \prec s_2 \prec \tilde{s}_2 \prec \cdots$.

Proof: Let B contain every $\lfloor m/b \rfloor$-th segment of S, starting with the lowest segment $s_0$. We
define $s_{i+1}$ inductively. If the next segment after $s_i$ has either the left or right endpoints at distance
greater than $2^{\ell-h}$, let $s_{i+1}$ be this segment, which satisfies (1a). Otherwise, let $s_{i+1}$ be the highest
segment of B which satisfies (1b).

Now impose grids over $I_L$ and $I_R$, both consisting of $2^{2h}$ subintervals of length $2^{\ell-2h}$. We
obtain $\tilde{s}_i$ from $s_i$ by rounding each endpoint to the grid point immediately above. This immediately
implies $\tilde{s}_0 \prec \tilde{s}_1 \prec \cdots$ and $s_i \prec \tilde{s}_i$. Unfortunately, $\tilde{s}_i$ and $s_{i+k}$ may intersect for arbitrarily large $k$
(e.g. $s_i, \ldots, s_{i+k}$ are very close on the left, while each consecutive pair is far on the right). However,
we will show that inside the center slab, $\tilde{s}_i \prec s_{i+2}$. (See Figure 3.)

Figure 3: Proof of Observation 5: a center slab.

By construction, $s_i$ and $s_{i+2}$ are vertically separated by more than $2^{\ell-h}$ either on the left or
on the right. Since lateral slabs have a fraction of $2^{-h}$ of the width of the entire slab, the vertical
separation exceeds $2^{\ell-h}/2^h = 2^{\ell-2h}$ anywhere in the center slab. Rounding $s_i$ to $\tilde{s}_i$ represents a
vertical shift of less than $2^{\ell-2h}$ anywhere in the slab. Hence, $\tilde{s}_i \prec s_{i+2}$ in the center slab. □
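The rounding step in the proof can be sketched as follows. The sketch assumes integer coordinates, $\ell \ge 2h$, a grid anchored at the lower end `base` of the interval, and it reads "the grid point immediately above" as the nearest grid point at or above the endpoint; the names `round_up_to_grid` and `round_segment` are hypothetical.

```python
def round_up_to_grid(y, base, ell, h):
    """Round y up to the grid of spacing 2^(ell - 2h) anchored at `base`.

    Returns the smallest grid point that is >= y, so the upward shift is
    always in the range [0, 2^(ell - 2h)).
    """
    step = 1 << (ell - 2 * h)
    offset = y - base
    return base + -(-offset // step) * step  # ceiling division on integers


def round_segment(left_y, right_y, left_base, right_base, ell, h):
    """Round both endpoints of a segment; each interval has its own grid."""
    return (round_up_to_grid(left_y, left_base, ell, h),
            round_up_to_grid(right_y, right_base, ell, h))
```

For $\ell = 8$ and $h = 2$ the grid spacing is 16, so with `base = 100` the endpoint 117 moves to 132 while 116, already on the grid, stays put. Because both endpoints move up by less than one grid cell, so does every point of the segment in between.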

We now describe how to implement Slab(), assuming the intervals containing the left endpoints
($I_L$) and the right endpoints ($I_R$) both have length $2^\ell$. In this section, we only deal with points in
the center slab. It is easy to Split() Q into subsets corresponding to the center and lateral slabs,
and Unsplit() at the end.

We use Observation 5 instead of Observation 1. Since $I_L$ and $I_R$ have equal length, the map
φ is affine. Thus, it can be implemented using parallel multiplication and parallel addition. This
means step 2 can be implemented in time $O(n\ell/w)$ using standard operations.
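To illustrate why an affine map needs only parallel multiplication and addition, here is a minimal sketch that applies $(x, y) \mapsto a\,y + b\,x + c$ to several packed points at once. It simulates words with Python integers and makes simplifying assumptions that are not from the paper: x- and y-coordinates are stored in two separate words, the coefficients `a`, `b`, `c` are non-negative integers, and every result fits in its field so that no carry crosses a field boundary. All function names are hypothetical.

```python
def pack(values, field_bits):
    """Pack small non-negative integers into one word, field 0 in the low bits."""
    word = 0
    for i, v in enumerate(values):
        assert 0 <= v < (1 << field_bits)
        word |= v << (i * field_bits)
    return word


def unpack(word, field_bits, count):
    """Extract `count` fields of `field_bits` bits each."""
    mask = (1 << field_bits) - 1
    return [(word >> (i * field_bits)) & mask for i in range(count)]


def affine_parallel(x_word, y_word, a, b, c, field_bits, count):
    """Compute a*y + b*x + c in every field with O(1) word operations.

    Multiplying a packed word by a scalar multiplies every field by that
    scalar, provided each product stays below 2^field_bits. The packed
    constant c_word would be precomputed once on a real word RAM.
    """
    c_word = pack([c] * count, field_bits)
    return a * y_word + b * x_word + c_word
```

With 8-bit fields, `xs = [1, 2, 3]`, `ys = [4, 5, 6]` and `(a, b, c) = (2, 3, 1)`, unpacking the result gives `[12, 17, 22]`. A projective map would additionally require a per-field division by a different divisor, which is the operation that has no such constant-time counterpart.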

Because we only deal with points in the center slab, and there $s_i \prec \tilde{s}_i \prec s_{i+2}$ (just like in the
old Observation 1), steps 4 and 5 work in the same way.

4.2 Lateral Slabs 

To deal with the left and right slabs, we use the following simple observation, which we only state
for the left slab by symmetry. Note that the guarantees of this observation (for the left slab) are
virtually identical to those of Observation 5 (for the center slab). Thus, we can simply apply the
algorithm of the previous section for the left and right slabs.

Observation 6. Fix $b$ and $h$. Let S be a set of $m$ sorted disjoint segments, such that all left
endpoints lie on an interval $I_L$ and all right endpoints lie on an interval $I_R$, where both $I_L$ and
$I_R$ have length $2^\ell$. In $O(b)$ time, we can find $O(b)$ segments $t_0, t_1, \ldots \in S$ in sorted order, which
include the lowest segment of S, such that:

(1) for each $i$, at least one of the following holds:

(1a) there are at most $m/b$ segments of S between $t_i$ and $t_{i+1}$.

(1b) anywhere in the left slab, the vertical separation between $t_i$ and $t_{i+1}$ is less than $2^{\ell-h+1}$.

(2) there exist segments $\tilde{t}_0 \prec \tilde{t}_1 \prec \cdots$ cutting across the left slab, satisfying all of the following:

(2a) distances between the left endpoints of the $\tilde{t}_i$'s are multiples of $2^{\ell-2h}$.

(2b) ditto for the right endpoints.

(2c) inside the left slab, $t_0 \prec \tilde{t}_0 \prec t_2 \prec \tilde{t}_2 \prec \cdots$.

Figure 4: Proof of Observation 6: a left slab.

Proof: Let $I_A$ be the vertical interval at the intersection of the right edge of the left slab with
the parallelogram defined by $I_L$ and $I_R$. Note $I_A$ also has size $2^\ell$.

Let B contain every $\lfloor m/b \rfloor$-th segment of S, starting with the lowest segment $t_0$. Given $t_i$, we
define $t_{i+1}$ to be the highest segment of B which has the left endpoint at distance at most $2^{\ell-h}$
away. If no such segment above $t_i$ exists, let $t_{i+1}$ be the successor of $t_i$ in B (this will satisfy (1a)).
In the first case, (1b) is satisfied because the right endpoints of $t_i$ and $t_{i+1}$ are at distance at most $2^\ell$,
so on $I_A$, the separation is at most $2^{\ell-h}(1 - 2^{-h}) + 2^\ell \cdot 2^{-h} < 2^{\ell-h+1}$.

Now impose grids over $I_L$ and $I_A$, both consisting of $2^{h+1}$ subintervals of length $2^{\ell-h-1}$. We
obtain $\tilde{t}_i$ from $t_i$ by rounding the points on $I_L$ and $I_A$ to the grid point immediately above. Note
that the vertical distance between $t_i$ and $\tilde{t}_i$ is less than $2^{\ell-h-1}$ anywhere in the left slab. On the
other hand, the left endpoints of $t_i$ and $t_{i+2}$ are at distance more than $2^{\ell-h}$. The distance on $I_A$
(and anywhere in the left slab) is at least $2^{\ell-h}(1 - 2^{-h}) \ge 2^{\ell-h-1}$. Thus $\tilde{t}_i \prec t_{i+2}$. (See
Figure 4.) □

4.3 Bounding the Dependence on m 

Our analysis needs to be modified, because segments are simultaneously in the left, center and
right slabs, so they are included in 3 recursive calls. In other words, in recurrence (2), we have to
replace $\sum_i m_i = m - b'$ with a weaker inequality $\sum_i m_i \le 3m$. Recall that for our choice of $b$ and
$h$, the depth of the recursion is bounded by $O(\log_b m + \ell/h) = O(\lg^{1/(k+1)} m)$. Thus, the cost per
segment is increased by an extra factor of $3^{O(\lg^{1/(k+1)} m)} = 3^{O(\sqrt{\lg m})}$ for each bootstrapping round;
the cost per point does not change. With $k = \sqrt{\lg\lg m}$ rounds, the overall dependence on $m$ is
now increased slightly to $2^{O(\sqrt{\lg\lg m})} \cdot m^3 \cdot 3^{O(\sqrt{\lg m \,\lg\lg m})} = o(m^{3+\varepsilon})$. As before, this can be made
$O(m)$ by working with $m^{1/4}$ segments and recursing.
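The claim that the extra factor is negligible can be spelled out. The following is a sketch of the calculation, assuming the accumulated factor over all bootstrapping rounds is $3^{O(\sqrt{\lg m\,\lg\lg m})}$.

```latex
\begin{align*}
\sqrt{\lg m\,\lg\lg m}
  &= \lg m\cdot\sqrt{\frac{\lg\lg m}{\lg m}} = o(\lg m), \\
3^{O(\sqrt{\lg m\,\lg\lg m})}
  &= 2^{o(\lg m)} = m^{o(1)}, \\
2^{O(\sqrt{\lg\lg m})}\cdot m^{3}\cdot 3^{O(\sqrt{\lg m\,\lg\lg m})}
  &= m^{3+o(1)}, \\
\big(m^{1/4}\big)^{3+o(1)}
  &= m^{3/4+o(1)} = o(m).
\end{align*}
```

The last line is why running the algorithm on only $m^{1/4}$ selected segments keeps the segment-dependent cost of each call sublinear in $m$.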
5 Open Problems



Though our algorithm for offline point location is deterministic, the reductions to other problems [8] 
introduce randomization. It would be nice to derandomize them. Another interesting question is 
whether some of these reductions also hold in the other direction. For example, the trapezoidal 
decomposition problem of disjoint line segments is clearly no easier than offline point location, but 
can the same be said for other problems like 3-d convex hulls? 

A problem with an O(nlgn) upper bound that we currently cannot improve is counting in- 
tersections between a set of red segments and a set of blue segments, given that no segments of 
the same color intersect [j5J. Note that this problem is no easier than counting inversions in a 
permutation, which takes $O(n\sqrt{\lg n})$ time by the best known algorithm [7].

Finally, the complexity of the online point location problem remains open: can the $O\!\left(\frac{\lg n}{\lg\lg n}\right)$
and $O\!\left(\sqrt{\frac{\lg U}{\lg\lg U}}\right)$ upper bounds from [8] be improved, or can stronger lower bounds be proved?